05. The OpenCV Computer Vision Library

Throughout this course, you will be using OpenCV, a cross-platform computer vision library that was originally developed in the year 2000 to provide a common infrastructure for computer vision applications and to accelerate the use of machine vision in science and engineering projects. Originally developed by Intel, the open-source library is now supported by several companies and hundreds of experts all over the globe.

The library has more than 2500 algorithms that can be used to detect and recognize faces, identify objects, classify human actions in videos, track camera movements, track moving objects, perform machine learning, and much more. OpenCV is written natively in C++ but has interfaces to Python, Java and MATLAB as well. In this course, you will be using the C++ version of OpenCV.

The major advantage of using the OpenCV library is that you can leverage a well-tested set of state-of-the-art computer vision algorithms. Instead of concentrating on the actual implementation of computer vision concepts such as Sobel operators, keypoint detection or machine learning, you can use them right out of the box and focus on combining them in the right way to develop a working software prototype. Despite this ease of use, however, a good understanding of the theory behind those concepts is needed to use them correctly.

In the following, you will familiarize yourself with some basic concepts you will need to get started with OpenCV and to prepare yourself for the more advanced lessons later in the course. The modules listed below will be used extensively throughout this course. They are, however, only a small part of the entire OpenCV library. Later, you will also include some specialized modules such as flann (Fast Library for Approximate Nearest Neighbors) or dnn (Deep Neural Networks), which will be described only in those sections of this course where they are used.

A note on namespaces: Most OpenCV functions exist within the cv namespace. To shorten the code, many applications use the using namespace cv directive. In this course, however, this is not done, so that it is always clear when a function call comes from OpenCV.

OpenCV Library Overview

The core module is the section of the library that contains all of the basic object types and their operations. To use the library in your code, the following header has to be included:

#include "opencv2/core/core.hpp"

The highgui module contains user interface functions that can be used to display images or take simple user input. To use the library in your code, the following header has to be included:

#include "opencv2/highgui/highgui.hpp"

In this project, basic functions such as cv::imshow will be used to display images in a window.

The imgproc (image processing) module contains basic transformations on images, such as image filtering, geometric transformations, feature detection and tracking. To use the library in your code, the following header has to be included:

#include "opencv2/imgproc/imgproc.hpp"

The features2d module contains algorithms for detecting, describing, and matching keypoints between images. To use the library in your code, the following header has to be included:

#include "opencv2/features2d/features2d.hpp"

The OpenCV Matrix Datatype

The basic data type in OpenCV to store and manipulate images is the cv::Mat datatype. It can be used for arrays of any number of dimensions. The data stored in cv::Mat is arranged in a so-called raster scan order. For a two-dimensional array (such as a grayscale image), this means that the data is organized into rows, and each row appears one after the other. A three-dimensional array is arranged in planes, where each plane is filled out row by row, and then the planes are packed one after the other. Note that a color image is normally stored as a two-dimensional array with several channels per element, where the channel values of each pixel are stored next to each other, rather than as separate planes. To see how this works, let us look into the cv::Mat datatype more deeply:
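To make the raster scan order concrete, the following plain C++ sketch (it uses no OpenCV calls) computes where an element lands in a flat, densely packed buffer. The helper names rasterIndex and planeIndex are hypothetical and only illustrate the arithmetic. A real cv::Mat may also pad the end of each row, which this sketch ignores.

```cpp
#include <cstddef>

// Offset of element (row, col) in a 2D array whose rows are stored
// one after the other (raster scan order, no row padding assumed).
std::size_t rasterIndex(std::size_t row, std::size_t col, std::size_t cols)
{
    return row * cols + col;
}

// Offset of element (plane, row, col) in a 3D array whose planes are
// stored one after the other, each plane filled row by row.
std::size_t planeIndex(std::size_t plane, std::size_t row, std::size_t col,
                       std::size_t rows, std::size_t cols)
{
    return plane * rows * cols + rasterIndex(row, col, cols);
}
```

For an array with 640 columns, rasterIndex(1, 0, 640) evaluates to 640: the first element of the second row directly follows the last element of the first row.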

The data inside a cv::Mat variable can consist of either a single number or multiple numbers per element. In the case of multiple numbers (e.g. represented by cv::Scalar), the matrix is referred to as a multichannel array. There are several ways to create and initialize a cv::Mat variable. The create_matrix.cpp file in the workspace illustrates one way to do this.

Note: To build and run the code below, use the following steps:

  1. Go to the virtual Desktop by clicking the Desktop button. You can use Terminator or a VSCode terminal to run the following commands.
  2. From the /home/workspace/OpenCV_exercises directory, run the following commands in order:
       mkdir build && cd build
       cmake ..
       make
  3. Run the create_matrix executable from the build directory with the command: ./create_matrix

In the code example, the variable m1_8u is created with 480 rows and 640 columns, a depth of 8 bits stored as unsigned char, and a single channel (hence the type CV_8UC1). Then, every element of the image is set to the 8-bit maximum value of 255, which corresponds to white. The function cv::imshow displays the image on the screen. When you execute the code, you should see a white image appear in a window on the screen.

Matrices in OpenCV can also be created with three channels to represent color.

Here is a short task for you: In the create_matrix.cpp file, create a variable of type cv::Mat named m3_8u which has three channels with a depth of 8 bits per channel. Then, set the first channel to 255 using the cv::Scalar datatype and display the result. You can consult the OpenCV documentation if you get stuck.

Exercise

After completing the task above, which color does the image have?

SOLUTION: Blue

Manipulating Matrices

Now that you can create matrices, let us try to change some of their entries. By using the command cv::Mat::at<data type>(row, col) = data, the element at the given position can be replaced with data. Please note that the data type you provide to the at function has to match the actual data stored in the matrix you are trying to access.

Here is another short task for you: In the change_pixels.cpp file, write a nested loop that runs over the entire width of the matrix in the example below. Then, set every element to 255. Take special care to select the correct data type for the given format. What does the resulting image look like?

Note: You can build and run your code for this task using the same steps as above. For this exercise, the executable will be named change_pixels.

Code from the `change_pixels.cpp` file

After writing the nested loop described above, what does the resulting image look like?

SOLUTION: A white bar from left to right.

Loading and Handling Images

The next thing we want to do is to load an image from file. Let us assume that the image resides in the same path as the executable. By calling cv::imread we can load the image from file and assign it to a cv::Mat variable. Take a look at the following code example to see how a single image can be loaded from file. You can build the code as above, and you can run the code from the virtual desktop using the load_image_1 executable.

Code from the `load_image_1.cpp` file

Assuming that there are 5 images in total in the code directory (img0005.png - img0009.png), they can be read from file one after the other by assembling each filename in a loop. The next example shows how the filename can be built from single elements using string concatenation and the setfill function, which ensures that leading zeros are added to the loop variable before it is appended to the filename. You can run the next example using the load_image_2 executable.
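The filename assembly can be sketched in plain C++ as follows. This is not the course's load_image_2.cpp. The helper makeFilename and its parameters are hypothetical, and the sketch pairs std::setfill with std::setw, which sets the field width that the fill character pads up to.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Build a filename such as "img0005.png" from a prefix, a zero-padded
// index and a file extension. makeFilename is a hypothetical helper.
std::string makeFilename(const std::string &prefix, int index, int width,
                         const std::string &extension)
{
    std::ostringstream name;
    // setw applies only to the next value written (the index);
    // setfill('0') selects '0' as the padding character.
    name << prefix << std::setfill('0') << std::setw(width) << index << extension;
    return name.str();
}
```

With these arguments, makeFilename("img", 5, 4, ".png") produces "img0005.png", so a loop from 5 to 9 yields the five filenames mentioned above.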

Code from the `load_image_2.cpp` file

Later in the course, we will load and process several images one after the other. It is important to handle large amounts of data in a smart way so that images and other structures are not needlessly copied. Also, we want to flexibly rearrange data as well as delete and append elements on a regular basis. In C++, this can be achieved by using vectors. In the following code, a set of images is loaded from file as before and pushed into a dynamic list of type vector<cv::Mat>. Then, an iterator is used to loop over the list and display the loaded images one by one.

You can run the code below using the load_image_3 executable.

Code from the `load_image_3.cpp` file

The auto keyword is simply asking the compiler to deduce the type of the variable from the initialization, which is much more convenient than writing vector<cv::Mat>::iterator it instead. The current image within the loop can be accessed by using the *it expression.
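The difference between the two spellings can be sketched in plain C++. To keep the example independent of OpenCV, a vector of strings stands in for vector<cv::Mat>. The function names joinExplicit and joinAuto are hypothetical.

```cpp
#include <string>
#include <vector>

// Loop with an explicitly typed iterator and concatenate all entries.
std::string joinExplicit(std::vector<std::string> &items)
{
    std::string result;
    for (std::vector<std::string>::iterator it = items.begin(); it != items.end(); ++it)
    {
        result += *it; // *it is the current element
    }
    return result;
}

// Same loop, but the compiler deduces the iterator type from items.begin().
std::string joinAuto(std::vector<std::string> &items)
{
    std::string result;
    for (auto it = items.begin(); it != items.end(); ++it)
    {
        result += *it;
    }
    return result;
}
```

Both functions behave identically. Only the declaration of it differs.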

Here is a last exercise for you: In the loop of load_image_3.cpp, prevent image number 7 from being displayed.

Summary
